[FLINK-39485][filesystem/s3] Support bucket root using HeadBucket for empty object key #27964

Closed
macdoor wants to merge 1 commit into apache:master from macdoor:FLINK-39485-native-s3-bucket-root

@macdoor macdoor commented Apr 18, 2026

What is the purpose of the change

Fix NativeS3FileSystem.getFileStatus for warehouse-style URIs s3://bucket (empty object key). AWS SDK v2 rejects HeadObject with an empty key (Key cannot be empty), which breaks catalog creation / existence checks against S3-compatible storage (e.g. MinIO).
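To illustrate how a bucket-only URI ends up with an empty object key, here is a minimal sketch using `java.net.URI`. The class `S3UriKeySketch` and its methods are hypothetical names chosen for this example; this is not the actual `NativeS3AccessHelper.extractKey` implementation, only an assumption about the general shape of such a helper.

```java
import java.net.URI;

/** Illustrative sketch only; hypothetical helper, not Flink code. */
final class S3UriKeySketch {

    /** Returns the bucket name, taken from the URI authority. */
    static String extractBucket(URI uri) {
        return uri.getHost();
    }

    /** Returns the object key: the URI path without its leading slash. */
    static String extractKey(URI uri) {
        String path = uri.getPath(); // "" for s3://my-bucket, "/a/b" for s3://my-bucket/a/b
        if (path == null || path.isEmpty()) {
            return "";
        }
        return path.startsWith("/") ? path.substring(1) : path;
    }
}
```

Under these assumptions, both `s3://my-bucket` and `s3://my-bucket/` produce an empty key, which is the case a HeadObject request cannot express.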

Brief change log

  • For an empty object key, use HeadBucket and return a directory FileStatus when the bucket exists.
  • Handle NoSuchBucketException as FileNotFoundException.
  • Add S3ClientProvider.createForTesting and NativeS3FileSystemBucketRootTest (mocked S3Client) to lock in the HeadBucket vs HeadObject behavior.
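The branching described in the change log can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `HeadCalls`, `Status`, `NoSuchBucket`, and `NoSuchKey` are hypothetical stand-ins for the S3 client, Flink's `FileStatus`, and the SDK exceptions, so that the example compiles without any dependencies.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Illustrative sketch only; every type here is a hypothetical stand-in. */
final class BucketRootStatusSketch {

    /** Stand-ins for the SDK's "not found" errors (assumed unchecked here). */
    static final class NoSuchBucket extends RuntimeException {}
    static final class NoSuchKey extends RuntimeException {}

    /** Minimal view of the two S3 calls involved. */
    interface HeadCalls {
        void headBucket(String bucket);              // throws NoSuchBucket if absent
        long headObject(String bucket, String key);  // returns size; throws NoSuchKey if absent
    }

    /** Simplified file status: path, directory flag, and length. */
    static final class Status {
        final String path;
        final boolean isDir;
        final long length;

        Status(String path, boolean isDir, long length) {
            this.path = path;
            this.isDir = isDir;
            this.length = length;
        }
    }

    static Status getFileStatus(HeadCalls s3, String bucket, String key) throws IOException {
        String path = "s3://" + bucket + (key.isEmpty() ? "" : "/" + key);
        if (key.isEmpty()) {
            // Bucket root: probe the bucket instead of an object with an empty key.
            try {
                s3.headBucket(bucket);
            } catch (NoSuchBucket e) {
                throw new FileNotFoundException("Bucket not found: " + path);
            }
            return new Status(path, true, 0L);
        }
        // Regular object: the usual HeadObject path.
        try {
            return new Status(path, false, s3.headObject(bucket, key));
        } catch (NoSuchKey e) {
            throw new FileNotFoundException("Object not found: " + path);
        }
    }
}
```

Keeping the two calls behind a small interface also makes it straightforward for a test double to assert that HeadObject is never invoked for the bucket root.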

Verifying this change

This change added tests and can be verified as follows:

  • mvn -pl flink-filesystems/flink-s3-fs-native test -Dtest=NativeS3FileSystemBucketRootTest
  • Previously validated manually against MinIO: creating a Paimon catalog with warehouse = 's3://<bucket>' (no prefix) succeeds with this patch.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes — test scope mockito-core (version from parent ${mockito.version})
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no / don't know
  • Anything that affects deployment or recovery: no
  • The S3 file system connector: yes

Documentation

  • Does this pull request introduce a new feature? no

@flinkbot
Collaborator

flinkbot commented Apr 18, 2026

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@spuru9
Contributor

spuru9 commented Apr 18, 2026

@macdoor Can you update the PR description to follow the default template? https://github.com/apache/flink/blob/master/.github/PULL_REQUEST_TEMPLATE.md

Contributor

@spuru9 spuru9 left a comment

As there are no existing tests for this file, it's worth adding a small test to establish the pattern. Otherwise, can you describe in the PR description how the fix was validated against a real bucket?

@RocMarshal
Contributor

Hi @Samrat002, could you help take a look? Thank you.

@github-actions github-actions Bot added the community-reviewed PR has been reviewed by the community. label Apr 18, 2026
@macdoor macdoor force-pushed the FLINK-39485-native-s3-bucket-root branch 3 times, most recently from 37c33b3 to e91ae46 Compare April 18, 2026 09:31
… empty object key

HeadObject requires a non-empty object key in AWS SDK v2. Warehouse URIs
such as s3://my-bucket yield an empty key from NativeS3AccessHelper.extractKey,
which caused SdkClientException when creating catalogs or checking paths.

Use HeadBucket for the bucket root and return a directory FileStatus.
Handle NoSuchBucketException as FileNotFoundException.

Add S3ClientProvider.createForTesting and NativeS3FileSystemBucketRootTest
(DelegatingS3Client-based test double) to verify HeadBucket is used and
HeadObject is not called for the bucket root.

https://issues.apache.org/jira/browse/FLINK-39485
Made-with: Cursor
@macdoor macdoor force-pushed the FLINK-39485-native-s3-bucket-root branch from e91ae46 to 1bc7bc8 Compare April 18, 2026 10:13
@macdoor
Author

macdoor commented Apr 18, 2026

@flinkbot run azure

@Samrat002
Contributor

I tried this against a real S3 bucket and was not able to reproduce the issue.
@macdoor, can you confirm that the issue described is only observable with MinIO, not S3?

@macdoor
Author

macdoor commented Apr 20, 2026

@Samrat002 You are right. I can reproduce this on MinIO with warehouse = s3:// (empty key), but not on S3.

@Samrat002
Contributor

Samrat002 commented Apr 20, 2026

thank you, @macdoor, for confirming. 🙌🏻

Let's discuss the following with the community:

  1. Should native-s3-fs continue to fully support MinIO, given that MinIO has moved away from the Apache 2.0 license?
  2. Supporting S3-compatible file systems in native-s3-fs adds code complexity. Are we willing to take on that maintenance overhead?

IMO native-s3-fs should support only pure S3 (not S3-compatible file systems).

@gaborgsomogyi
Contributor

Is this a bug in the vanilla S3 integration, or does it only affect MinIO support?

@Samrat002
Contributor

This is not a bug in the vanilla S3 integration. I have validated that the issue does not reproduce with vanilla S3.
It is only reproducible with MinIO.

@macdoor macdoor closed this Apr 20, 2026
6 participants